Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems
Abstract
Computer simulations of realistic applications usually require solving a set of non-linear partial differential equations (PDEs) over a finite region. The process of obtaining numerical solutions to the governing PDEs involves solving large sparse linear or eigen systems over the unstructured meshes that model the underlying physical objects. These systems are often solved iteratively, where the sparse matrix-vector multiply (SPMV) is the most expensive operation within each iteration. In this paper, we focus on the efficiency of SPMV using various ordering/partitioning algorithms. We examine different implementations using three leading programming paradigms and architectures. Results show that ordering greatly improves performance, and that cache reuse can be more important than reducing communication. However, a multithreaded implementation indicates that ordering and partitioning are not required on the Tera MTA to obtain an efficient and scalable SPMV.
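As an illustration of the SPMV kernel the abstract refers to, the following is a minimal sketch in plain Python. It assumes a compressed sparse row (CSR) layout, which the abstract does not specify; the names `spmv_csr`, `bandwidth`, `row_ptr`, `col_idx`, and `values` are hypothetical and chosen only for this example. The `bandwidth` helper reports the largest distance between a row index and a stored column index, one simple proxy for how scattered the accesses to `x` are under a given ordering.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed layout)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Entries of row i occupy positions row_ptr[i] .. row_ptr[i+1]-1.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is read indirectly through col_idx; the ordering of the
            # unknowns decides how far apart these reads are in memory.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def bandwidth(row_ptr, col_idx):
    """Largest |i - j| over all stored entries (i, j)."""
    bw = 0
    for i in range(len(row_ptr) - 1):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            bw = max(bw, abs(i - col_idx[k]))
    return bw


# Hypothetical 4x4 tridiagonal example:
# A = [[ 2, -1,  0,  0],
#      [-1,  2, -1,  0],
#      [ 0, -1,  2, -1],
#      [ 0,  0, -1,  2]]
row_ptr = [0, 2, 5, 8, 10]
col_idx = [0, 1, 0, 1, 2, 1, 2, 3, 2, 3]
values = [2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0]
x = [1.0, 2.0, 3.0, 4.0]

y = spmv_csr(row_ptr, col_idx, values, x)
```

Renumbering the mesh vertices permutes the rows and columns of the matrix without changing the arithmetic; only the memory access pattern into `x` changes, which is why ordering can affect cache reuse.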
Similar resources
Computational Results for Parallel Unstructured Mesh Computations (paper submitted to: Third National Symposium on Large-scale Structural Analysis for High-performance Computers and Workstations)
The majority of finite element models in structural engineering are composed of unstructured meshes. These unstructured meshes are often very large and require significant computational resources; hence they are excellent candidates for massively parallel computation. Parallel solution of the sparse matrices that arise from such meshes has been studied heavily, and many good algorithms have been d...
Parallel Library for Unstructured Mesh Problems (PLUMP)
The growing class of applications which solve partial differential equations (PDEs) on unstructured adaptive meshes is considered. Solution to such sparse, non-symmetric and in most cases ill-conditioned systems is often obtained using iterative methods. This paper describes the specification and implementation of the Parallel Library for Unstructured Mesh Problems (PLUMP), which supports the tra...
Performance comparison of data-reordering algorithms for sparse matrix-vector multiplication in edge-based unstructured grid computations
Several performance improvements for finite-element edge-based sparse matrix–vector multiplication algorithms on unstructured grids are presented and tested. Edge data structures for tetrahedral meshes and triangular interface elements are treated, focusing on nodal and edges renumbering strategies for improving processor and memory hierarchy use. Benchmark computations on Intel Itanium 2 and P...
Mapping Unstructured Grid Computations to Massively Parallel Computers
This thesis investigates the mapping problem: assign the tasks of a parallel program to the processors of a parallel computer such that the execution time is minimized. First, a taxonomy of objective functions and heuristics used to solve the mapping problem is presented. Next, we develop a highly parallel heuristic mapping algorithm, called Cyclic Pairwise Exchange (CPE), and discuss its place in the t...
Scalability of parallel finite element algorithms on multi-core platforms
The speedup of element-by-element FEM algorithms depends not only on peak processor performance but also on access time to shared mesh data. Eliminating memory boundness would significantly speed up unstructured mesh computations on hybrid multi-core architectures, where the gap between processor and memory performance continues to grow. The speedup can be achieved by ordering unknowns so that ...
Publication date: 2000